procedures. The figure is an autoradiogram of restriction fragments, separated by gel electrophoresis, of

Methods for Identification and Isolation of
Specific Nucleotide Sequences in cDNA and Genomic DNA 

BACKGROUND OF THE INVENTION 

Field of the Invention 

This invention is in the field of molecular and cellular biology. In general, the invention
is related to a method for the identification and isolation of specific genetic sequences or genetic
markers from the genomic DNA or cDNA of an organism. In particular, the invention is related
to a method whereby a DNA fragment from a first sample of genomic DNA or cDNA, not found
in a second sample of genomic DNA or cDNA, may be identified and isolated via a series of
digestion, amplification, purification and sequencing steps. This invention has utility in the
identification and isolation of genomic DNA or cDNA sequences that may serve as genetic
markers for use in a variety of medical, forensic, industrial and plant breeding procedures.

Related Art 

Genomic DNA 

In examining the structure and physiology of an organism, tissue or cell, it is often
desirable to determine its genetic content. The genetic framework (i.e., the genome) of an
organism is encoded in the double-stranded sequence of nucleotide bases in the deoxyribonucleic
acid (DNA) which is contained in the somatic and germ cells of the organism. The genetic
content of a particular segment of DNA, or gene, is only manifested upon production of the
protein which the gene ultimately encodes. In order to produce a protein, a complementary copy
of one strand of the DNA double helix (the "sense" strand) is produced by polymerase enzymes,
resulting in a specific sequence of messenger ribonucleic acid (mRNA). This mRNA is then
translated by the protein synthesis machinery of the cell, resulting in the production of the
particular protein encoded by the gene. There are additional sequences in the genome that do not
encode a protein (i.e., "noncoding" regions) which may serve a structural, regulatory, or unknown
function. Thus, the genome of an organism or cell is the complete collection of protein-encoding
genes together with intervening noncoding DNA sequences. Importantly, each somatic cell of a
multicellular organism contains the full complement of genomic DNA of the organism, except in
cases of focal infections or cancers, where one or more xenogeneic DNA sequences may be
inserted into the genomic DNA of specific cells and not into other, non-infected, cells in the
organism. As noted below, however, the expression of the genes making up the genomic DNA
may vary between individual cells.

cDNA and cDNA Libraries 

Within a given cell, tissue or organism, there exist myriad mRNA species, each encoding
a separate and specific protein. This fact provides a powerful tool to investigators interested in
studying genetic expression in a tissue or cell: mRNA molecules may be isolated and further
manipulated by various molecular biological techniques, thereby allowing the elucidation of the
full functional genetic content of a cell, tissue or organism.

One common approach to the study of gene expression is the production of
complementary DNA (cDNA) clones. In this technique, the mRNA molecules from an organism
are isolated from an extract of the cells or tissues of the organism. This isolation often employs
solid chromatography matrices, such as cellulose or hydroxyapatite, to which oligomers of
deoxythymidine (dT) have been complexed. Since the 3' termini on all eukaryotic mRNA
molecules contain a string of deoxyadenosine (dA) bases, and since dA binds to dT, the mRNA
molecules can be rapidly purified from other molecules and substances in the tissue or cell extract.
From these purified mRNA molecules, cDNA copies may be made using the enzyme reverse
transcriptase, which results in the production of single-stranded cDNA molecules. The
single-stranded cDNAs may then be converted into a complete double-stranded DNA copy of the
original mRNA (and thus of the original double-stranded DNA sequence, encoding this mRNA,
contained in the genome of the organism) by the action of a DNA polymerase. The
protein-specific double-stranded cDNAs can then be inserted into a plasmid, which is then introduced into
a host bacterial cell. The bacterial cells are then grown in culture media, resulting in a population
of bacterial cells containing (or in many cases, expressing) the gene of interest.

This entire process, from isolation of mRNA to insertion of the cDNA into a plasmid to
growth of bacterial populations containing the isolated gene, is termed "cDNA cloning." If
cDNAs are prepared from a number of different mRNAs, the resulting set of cDNAs is called a
"cDNA library," representing the different functional (i.e., expressed) genes present in the source
cell, tissue or organism. Genotypic analysis of these cDNA libraries can yield much information
on the structure and function of the organisms from which they were derived.

DNA Fingerprinting 

To determine the genotype of an organism, tissue or cell, a variety of molecular biological
techniques are employed. These techniques allow researchers, clinicians, forensic scientists and
others to probe for the presence of specific genes in the samples which are being studied. The
results of such analyses may be useful to researchers in examining the phylogenetic relationship
between two organisms, to clinicians in determining whether an individual is infected with a
particular disease or is a carrier of a disease-related gene, and to forensic scientists in analyzing
crime scene evidence such as blood or other tissues.

A technique often used in such genotypic analysis is known as DNA fingerprinting. This
technique relies on the digestion of the DNA of an organism, tissue or cell with a restriction
endonuclease enzyme which cleaves the DNA sample into fragments of discrete length. Due to
the specificity with which different restriction endonucleases cleave their DNA substrates, a given
set of enzymes will always produce the same results, in terms of fragment number and size (the
term "size" as used herein is defined as the length and/or molecular weight of a given restriction
fragment), from a given DNA sample. The restriction fragments may then be resolved by a variety
of techniques such as size exclusion chromatography, gel electrophoresis, or attachment to a
variety of solid matrices. Most commonly, gel electrophoresis is performed, and the restriction
fragments are resolved into a series of bands on the gel via their differential mobilities within the
gel (which is inversely related to fragment size). The pattern of these bands within the gel is
specific for a given DNA sample, and is often referred to as the "fingerprint" of that sample.
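
The digestion step described above can be illustrated with a minimal Python sketch. It assumes a linear DNA molecule and a single enzyme that recognizes GAATTC and cuts after the first base of the site (as EcoRI does); the function name `digest` and the toy sequence are hypothetical and serve only as an illustration.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (1 for an enzyme cutting G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# Toy sequence (illustrative only) containing two recognition sites.
toy = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
fragments = digest(toy)

# The "fingerprint" here is just the list of fragment sizes, largest first,
# mirroring the order of bands from the top of a gel downward.
fingerprint = sorted((len(f) for f in fragments), reverse=True)
```

Because the cut positions depend only on the sequence and the recognition site, running `digest` twice on the same input gives the same fragment sizes, which is the reproducibility the paragraph relies on.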

When the DNA fingerprints of closely related organisms, tissues or even cells are
compared, these fingerprints are often quite similar. However, subtle differences between the
fingerprints may be observed. These differences, termed "DNA polymorphisms," tend to increase
in number (i.e., the fingerprints become more dissimilar) as DNA samples from more distantly
related or unrelated organisms are compared. This technique of examining such Restriction
Fragment Length Polymorphisms, or "RFLPs," has been used for a number of years in genotypic
analysis of eukaryotes such as plants (Tanksley, S.D. et al., Bio/Technology 7:257-264 (1989))
and animals, including humans (Botstein, D. et al., Am. J. Hum. Genet. 32:314-331 (1980)). In
fact, RFLP analysis is being used in combination with other techniques in molecular biology to
determine the complete structure (i.e., the "map") of the human genome (See, e.g., Donis-Keller,
H. et al., Cell 51:319-337 (1987)). In this way, RFLP analysis can be used to determine the
relationship, or lack thereof, between specific organisms, tissues or cells by a simple comparison
of differences in their DNA fingerprints.

DNA Amplification 

One early drawback to the use of RFLP analysis, however, was its requirement for larger
amounts of DNA than are typically available in the samples to be analyzed. In addition, complex
genomic samples are often difficult to analyze by RFLP, as a multitude of different DNA
molecules are simultaneously fragmented and resolved. As a means of overcoming these
difficulties, investigators have increasingly turned to methods that increase the copy number of,
or "amplify," specific sequences of DNA in a sample.

A commonly used amplification technique is the Polymerase Chain Reaction ("PCR")
method invented by Mullis and colleagues (U.S. Pat. Nos. 4,683,195; 4,683,202; and 4,800,159).
This method uses "primer" sequences which are complementary to opposing regions on the DNA
sequence to be amplified. These primers are added to the DNA target sample, along with a molar
excess of nucleotide bases and a DNA polymerase (e.g., Taq polymerase), and the primers bind
to their target via base-specific binding interactions (i.e., adenine binds to thymine, cytosine to
guanine). By repeatedly passing the reaction mixture through cycles of increasing and decreasing
temperatures (to allow dissociation of the two DNA strands on the target sequence, synthesis of
complementary copies of each strand by the polymerase, and re-annealing of the new
complementary strands), the copy number of a particular sequence of DNA may be rapidly
increased.
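
How quickly the copy number grows can be seen from an idealized calculation, sketched below in Python. It assumes that a fixed fraction of templates is copied in every cycle (an efficiency of 1.0 means perfect doubling); the function name and the efficiency parameter are assumptions for illustration, and real reactions fall short of this ideal and eventually plateau.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle;
    1.0 corresponds to perfect doubling in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting molecule, 30 cycles, perfect doubling: 2**30 copies.
ideal = pcr_copies(1, 30)

# The same reaction at an assumed 50% per-cycle efficiency grows far more slowly.
reduced = pcr_copies(1, 30, efficiency=0.5)
```

Under perfect doubling, a single template yields about a billion copies after 30 cycles, which is why the technique can work from very small samples.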

Other techniques for amplification of target nucleic acid sequences have also been 
developed. For example, Walker et al. (U.S. Pat. No. 5,455,166; EP 0 684 315) described a
method called Strand Displacement Amplification (SDA), which differs from PCR in that it 
operates at a single temperature and uses a polymerase/endonuclease combination of enzymes to 
generate single-stranded fragments of the target DNA sequence, which then serve as templates 
for the production of complementary DNA (cDNA) strands. An alternative amplification 
procedure, termed Nucleic Acid Sequence-Based Amplification (NASBA), was disclosed by
Davey et al. (U.S. Pat. No. 5,409,818; EP 0 329 822). Similar to SDA, NASBA employs an
isothermal reaction, but is based on the use of RNA primers for amplification rather than DNA 
primers as in PCR or SDA. 

PCR-Based DNA Fingerprinting

Despite the availability of a variety of amplification techniques, most DNA fingerprinting 
methods rely on PCR for amplification, taking advantage of the well-characterized protocols and 
automation available for this technique. Examples of these PCR-based fingerprinting techniques 
include Random Amplified Polymorphic DNA (RAPD) analysis (Williams, J.G.K. et al., Nucl.
Acids Res. 18(22):6531-6535 (1990)), Arbitrarily Primed PCR (AP-PCR; Welsh, J., and
McClelland, M., Nucl. Acids Res. 18(24):7213-7218 (1990)), DNA Amplification Fingerprinting
(DAF; Caetano-Anolles et al., Bio/Technology 9:553-557 (1991)), and microsatellite PCR or
Directed Amplification of Minisatellite-region DNA (DAMD; Heath, D.D. et al., Nucl. Acids Res.
21(24):5782-5785 (1993)). All of these methods are based on the amplification of random DNA
fragments by PCR, using arbitrarily chosen primers. The utility of these techniques is limited, 
however, by their extreme sensitivity to the quality of the target DNA, which may be poor in some 
genomic or cDNA library samples. Use of poor-quality (e.g., fragmented, degraded or otherwise 
non-intact) DNA in these techniques can lead, for example, to spurious results due to incomplete 
amplification of desired target DNA sequences.

More recently, a technique named Amplification Fragment Length Polymorphism (AFLP)
analysis was developed by Vos and colleagues (EP 0 534 858; Vos, P. et al., Nucl. Acids Res.
23(21):4407-4414 (1995)). This technique, which is also PCR-based, uses specific combinations
of restriction endonucleases and adapters of discrete sequences, as well as primers that contain
the common sequences of the adapters. In this way, a sequence or fragment of DNA in a complex
sample may be specifically amplified and used for further analysis. The value of AFLP in genomic
analyses of certain plant and bacterial strains has been demonstrated (Lin, J.-J., and Kuo, J., Focus
17(2):66-70 (1995); Lin, J.-J., et al., Plant Molec. Biol. Rep. 14(2):156-169 (1996)), while others
have used AFLP for HLA-DR genotyping in humans (Yunis, I. et al., Tissue Antigens 38:78-88
(1991)).

Identification of Tissue-Specific cDNAs and Genomic Genetic Markers 

Despite the success of genetic mapping using the foregoing techniques, however, these
methods are limited in their abilities to identify source-specific DNA sequences. This limitation
is particularly true for those sequences derived from genomic DNA samples from different cells,
tissues or organisms, and for those derived from tissue cDNA libraries which comprise only those
DNA molecules that are actively expressed (i.e., used to make proteins) in the particular tissue
and which are thus a subset of genomic DNA. For cDNA libraries, however, methods have been
developed that overcome these limitations to some extent.

One such method, termed differential hybridization, relies on the knowledge that specific
genes are expressed differentially in certain cells or tissues as opposed to other cells or tissues.
To identify these cell- or tissue-specific genes, one can simply prepare cDNAs from two different
cell or tissue types and separately hybridize the cDNA samples to oligonucleotide probes prepared
from each of the samples. The resultant hybridization patterns can then be compared, and any
differences observed indicate the cell- or tissue-specific expression of one or more genes (and thus
the presence, in a cDNA library prepared from that cell or tissue, of a specific cDNA). This
technique was used to identify growth factor-regulated genes that are specifically expressed in
cells stimulated to grow by treatment with serum but that are not expressed in quiescent cells
(Lau, H.F., and Nathans, D., EMBO J. 4:3145-3151 (1985)).

A second, somewhat more sensitive, technique for identifying tissue-specific DNAs is the
use of subtractive libraries (See Hedrick, S.M. et al., Nature 308:149-153 (1984); Lin, J.-J., et
al., Focus 15(3):98-101 (1993)). In this method, cDNAs prepared from one tissue or cell
type are mixed with the mRNAs from another, closely related, tissue or cell type. The cDNAs
that are expressed in both cells or tissues then form DNA-RNA hybridization complexes, since
they are complementary to each other, while the cDNAs expressed selectively in one cell/tissue
but not the other will not form such a complex. The DNA-RNA complexes, representing cDNAs
that are not tissue-specific, can then be removed from the mixtures (i.e., "subtracted") by passing
the mixture through a poly-dT or hydroxyapatite column, to which the unhybridized cDNAs will
not bind. This procedure thus results in a purified sample that is enriched in tissue- or cell-specific
cDNAs.
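
The logic of subtraction can be sketched in a few lines of Python. This is a deliberately simplified model, assuming single-stranded cDNAs that hybridize only to perfectly complementary, full-length mRNAs; real hybridization tolerates mismatches and partial overlap. The function names and the short example sequences are hypothetical.

```python
# DNA base -> RNA base that pairs with it
PAIRING = {"A": "U", "C": "G", "G": "C", "T": "A"}


def rna_partner(cdna):
    """RNA sequence (5'->3') perfectly complementary to a single-stranded cDNA."""
    return "".join(PAIRING[base] for base in reversed(cdna))


def subtract(cdnas, mrnas):
    """Return the cDNAs that find no complementary mRNA in the other sample.

    cDNAs with a partner would form DNA-RNA hybrids and be removed;
    the ones returned here model the enriched, sample-specific cDNAs.
    """
    mrna_pool = set(mrnas)
    return [cdna for cdna in cdnas if rna_partner(cdna) not in mrna_pool]


# Hypothetical example: the first cDNA has a partner mRNA, the second does not.
first_sample_cdnas = ["ATGC", "TTAG"]
second_sample_mrnas = ["GCAU"]
enriched = subtract(first_sample_cdnas, second_sample_mrnas)
```

In this toy case only "TTAG" survives subtraction, because "GCAU" pairs with "ATGC" and removes it from the pool.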

Amplification-Based Cloning

While differential hybridization and the use of subtractive libraries may be suitable for the 
identification of DNA sequences that are expressed at relatively high levels in the source cells or 
tissues, they are not particularly useful when the starting samples contain only low levels of
genomic DNA (or mRNA used to make cDNAs). This problem is particularly important when 
the tissue or cell samples are themselves present in low quantities (as in many medical or forensic 
applications), or when the specific DNA sequence is expressed at low levels in the cell/tissue 
samples. 

PCR-based cloning of tissue-specific cDNAs has been used in the attempt to overcome
the lack of sensitivity of earlier approaches (see, e.g., Lee, C.C., et al., Science 239:1288-1291
(1988)). However, this approach still suffers from the major shortcoming of PCR itself: the
requirement for prior knowledge of the nucleotide sequence of the DNA to be amplified, to allow
construction of complementary PCR primers. Without knowing the nucleotide sequence of the
target DNA, PCR cannot be performed in order to amplify this sequence in the sample. Since the
target sequences are not known in many medical or forensic samples, PCR-based cloning is not
useful for the identification or isolation of tissue-specific cDNAs from these samples. For the
same reasons, these techniques are not suitable for the identification of previously unknown or
uncharacterized genes from cDNA libraries or genomic samples. Furthermore, as noted above,
the complexity of genomic DNA limits the utility of these techniques in the identification and
isolation of genetic markers from the genome of a cell or organism.

Thus, there remains an unmet need for a rapid, reproducible and reliable technique for
identifying fragments of DNA, or genes, that are unique to the genomes of specific organisms,
tissues or cells, or that are unique to cDNA libraries prepared from these specific sources, without
prior knowledge of the nucleotide sequence of the unique DNA fragments. Particularly desirable
are methods that would rapidly identify, and allow the isolation of, specific DNA sequences found
in one source cDNA library or genome but not in another library or genome. Such a technique
would find utility in a variety of applications, particularly in clinical, forensic and plant breeding
applications.

BRIEF SUMMARY OF THE INVENTION 

The present invention is directed to AFLP-based methods that address these unmet needs.
In particular, the invention relates to such methods that allow the identification and isolation of
tissue-specific cDNAs from cDNA libraries, or the identification and isolation of specific genetic 
markers from samples of genomic DNA. 

In one embodiment, the invention is directed to a method for identifying a cDNA fragment 
from a first cDNA library which is not present in a second cDNA library, comprising the steps of 
(a) digesting a first and second cDNA library with at least one restriction enzyme to give a 
collection of restriction fragments, and (b) identifying one or more unique fragments from the first 
cDNA library by comparing the fragments from the first cDNA library to fragments from the 
second cDNA library. 

In another embodiment, the invention is directed to a method for identifying a genetic
marker, comprising a DNA fragment from a first sample of genomic DNA, which is not present
in a second sample of genomic DNA. This method comprises the steps of (a) digesting the first
and second samples of genomic DNA with at least one restriction enzyme to give a collection of
restriction fragments, and (b) identifying one or more unique DNA fragments in the first or second
samples of genomic DNA by comparing the fragments obtained from one sample of genomic
DNA to those obtained from the other sample.
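
As a rough illustration of the comparison in step (b), the following Python sketch treats each digested sample as a list of fragment sizes and reports the sizes seen in the first sample but not in the second. The function name, the `tolerance` parameter (allowing for small sizing errors) and the example sizes are assumptions made for illustration; the text does not prescribe a particular comparison procedure.

```python
def unique_fragment_sizes(first_sizes, second_sizes, tolerance=0):
    """Sizes from the first sample with no band within `tolerance` in the second.

    Sizes are in base pairs; tolerance=0 requires an exact size match
    for two bands to be considered the same fragment.
    """
    return sorted(
        size
        for size in set(first_sizes)
        if all(abs(size - other) > tolerance for other in second_sizes)
    )


# Hypothetical band sizes (bp) read from two lanes of a gel.
first_sample = [120, 250, 400]
second_sample = [120, 400]
candidates = unique_fragment_sizes(first_sample, second_sample)
```

Swapping the two arguments gives the fragments unique to the second sample, which corresponds to the "first or second" wording of the embodiment.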

According to the invention, the identifying step in the above methods is preferably
accomplished by separating the restriction fragments according to size, which as used herein
is defined as the length and/or molecular weight of the restriction fragments. This aspect of the
invention may further comprise sequencing the unique cDNA or genomic DNA fragments, and
may entail amplification of the restriction fragments prior to the identifying step (b). In another
aspect of the invention, the restriction fragments are detectably labeled. The present invention
also encompasses the above method which further comprises the steps of (c) isolating at least one
unique fragment, and (d) inserting the fragment into a vector, which may be an expression vector,
for use in transfecting or transforming a prokaryotic or eukaryotic host cell; the fragment may be
amplified prior to insertion into the vector. In another aspect of this embodiment, the unique
fragment may be sequenced according to routine nucleotide sequencing methods.

In another embodiment, the present invention provides a method for isolating a cDNA
from a first cDNA library, comprising the steps of (a) mixing one or more of the unique
fragments identified as summarized above, or one or more oligonucleotide probes which are
complementary to the fragments, with a first cDNA library under conditions stringent for
hybridization of the unique fragments or oligonucleotide probes to the first cDNA library; and (b)
isolating a cDNA which is complementary to the unique fragments or to the oligonucleotide
probes. Analogously, the invention also provides a method for isolating a genetic marker,
comprising a DNA fragment, from a sample of genomic DNA. This method comprises the steps
of (a) mixing one or more of the unique fragments identified as summarized above, or one or
more oligonucleotide probes which are complementary to the fragments, with a sample of DNA
under conditions stringent for hybridization of the unique fragments or oligonucleotide probes to
the sample of DNA; and (b) isolating a DNA fragment which is complementary to the unique
fragments or to the oligonucleotide probes.

According to the present invention, the isolation steps (b) of the above-described methods
may be accomplished by gel electrophoresis, density gradient centrifugation, sizing
chromatography, affinity chromatography, immunoadsorption, or immunoaffinity
chromatography. In this embodiment, isolated cDNA or DNA fragments may also be
sequenced, amplified, or inserted into a vector (which may be an expression vector). DNA
fragments isolated by this embodiment of the present invention will be useful in, for example, the
preparation of DNA or RNA probes, and to aid in a variety of medical, forensic, industrial and
plant breeding applications.

The invention also encompasses the methods described above, wherein the amplification
of the unique cDNA or genomic DNA fragments is accomplished by a method comprising the
steps of (a) ligating one or more adapter oligonucleotides to a unique cDNA fragment or genomic
DNA fragment to form a DNA-adapter complex; (b) hybridizing the DNA-adapter complex,
under stringent conditions, with one or more oligonucleotide primers which are complementary
to the adapter portion of the DNA-adapter complex to form a hybridization complex; and (c)
amplifying the DNA-adapter complex. In this aspect of the invention, the adapter oligonucleotide
may contain one or more restriction sites which may be used to insert the DNA-adapter complex
into a vector.
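
The point of steps (a) and (b) is that a primer designed against the known adapter can prime a fragment whose own sequence is unknown. The Python sketch below illustrates this under simplifying assumptions: the same adapter is ligated to both ends of the fragment, strands are modeled as plain strings, and hybridization requires a perfect match. The adapter and fragment sequences and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def ligate_adapters(fragment, adapter):
    """Top strand (5'->3') of a fragment carrying the adapter at both ends."""
    return adapter + fragment + reverse_complement(adapter)


def primer_binds(primer, template):
    """True if the primer can base-pair perfectly somewhere on the template strand."""
    return reverse_complement(primer) in template


adapter = "GACTGCGTACC"      # hypothetical adapter sequence
fragment = "AATTCAGGCTTAGC"  # hypothetical fragment of unknown sequence
top = ligate_adapters(fragment, adapter)
bottom = reverse_complement(top)

# A primer carrying the adapter sequence anneals to the 3' end of both strands
# of the DNA-adapter complex, so no knowledge of the fragment itself is needed.
primer = adapter
```

Here `primer_binds(primer, top)` and `primer_binds(primer, bottom)` are both true, whereas the same primer finds no binding site on the fragment without adapters.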

According to the present invention, the first and second cDNA libraries or samples of
genomic DNA used in the above-described methods may be derived from an individual cell (which
may be prokaryotic or eukaryotic), a tissue (which may be a plant or an animal tissue, most
preferably a human tissue including a human embryonic or fetal tissue), an organ or a whole
organism. The genetic marker identified according to this embodiment of the invention may be
a cancer marker, an infectious disease marker, a genetic disease marker, a marker of embryonic
development, a tissue-specific marker or an enzyme marker. In one such aspect of the invention,
one cDNA library or sample of genomic DNA may be derived from an animal suffering from an
infectious disease (e.g., a disease of bacterial, fungal, viral or parasitic origin), and the other cDNA
library or sample of genomic DNA may be from an animal not suffering from an infectious
disease. In another aspect, one cDNA library or sample of genomic DNA may be derived from
an animal suffering from cancer and the other may be derived from an animal not suffering from
cancer. In another aspect, one cDNA library or sample of genomic DNA may be obtained from
a cancerous animal tissue and the other from a noncancerous animal tissue, which tissues may
both be obtained from the same animal. In another aspect, one cDNA library or sample of
genomic DNA may be from an animal suffering from a genetic disease and the other may be from
an animal not suffering from a genetic disease. In another aspect, one cDNA library or sample
of genomic DNA may be derived from a pathogenic microorganism and the other from a
non-pathogenic organism. In another aspect, one cDNA library or sample of genomic DNA may be
derived from an organism expressing an enzyme, and the other sample may be derived from an
organism not expressing an enzyme. In another aspect, one cDNA library or sample of genomic
DNA may be derived from an organism expressing an industrially useful protein, and the second
may be derived from an organism not expressing an industrially useful protein. In another aspect,
one cDNA library or sample of genomic DNA may be derived from a diseased plant and the other
sample may be derived from a non-diseased plant. In another aspect, one cDNA library or sample
of genomic DNA may be from a plant resistant to an environmental stress, which may be drought,
excess temperature, diminished temperature, chemical toxicity by herbicides, pollution, excess
light or diminished light, and the other sample may be from a plant not resistant to an
environmental stress.

In another embodiment, the present invention provides a method of determining the
relationship between a first individual and a second individual comprising the steps of (a) digesting
a cDNA library or a sample of genomic DNA obtained from the first and second individuals with
at least one restriction enzyme to give a collection of restriction fragments; (b) separating the
restriction fragments from the first and second individuals according to size; and (c) determining
the similarities and dissimilarities of the sizes or concentrations of the restriction fragments
separated in step (b). In a preferred aspect of this embodiment, this comparison is accomplished
by computer analysis.
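
One simple way such a computer comparison could be scored is sketched below in Python. The Jaccard index used here (shared band sizes divided by all distinct band sizes) is an assumption chosen for illustration, not a metric specified in the text; the function name and the example sizes are likewise hypothetical, and band concentrations are ignored.

```python
def band_similarity(sizes_a, sizes_b):
    """Jaccard similarity of two fragment-size patterns.

    Returns 1.0 for identical patterns and 0.0 when no band size is shared.
    """
    a, b = set(sizes_a), set(sizes_b)
    if not a and not b:
        return 1.0  # two empty patterns are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical band sizes (bp) for two individuals.
individual_1 = [100, 200, 300]
individual_2 = [100, 200, 400]
score = band_similarity(individual_1, individual_2)
```

With two of four distinct band sizes shared, the score for this pair is 0.5; a higher score would suggest a closer relationship under this simple model.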

Other preferred embodiments of the present invention will be apparent to one of ordinary 
skill in light of the following drawings and description of the invention, and of the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is an autoradiogram of 32P-labeled EcoRI/MseI restriction fragments, separated
by gel electrophoresis (5% polyacrylamide + 8M urea sequencing gel), of samples from a human
brain cDNA library (500 nanograms per sample) containing increasing amounts of
pCMVSPORTCAT cDNA. Lane 1: pCMVSPORTCAT control (no brain cDNA); Lanes 2-6:
500 ng of human brain cDNA containing the following amounts of pCMVSPORTCAT cDNA:
Lane 2, 0.3 ng; Lane 3, 3 ng; Lane 4, 30 ng; Lane 5, 300 ng; Lane 6, 0 ng (brain cDNA control).
Arrows indicate pCMVSPORTCAT-specific fragments.

Figure 2 is an autoradiogram of 32P-labeled EcoRI/MseI restriction fragments, separated
by gel electrophoresis (5% polyacrylamide + 8M urea sequencing gel), of samples prepared from
cDNA libraries of human liver, leukocytes, kidney or brain (500 nanograms per sample). Lanes
1, 2: liver; Lanes 3, 4: leukocytes; Lanes 5, 6: kidney; Lanes 7, 8: brain. Arrow indicates a unique
DNA fragment detected in brain cDNA.

Figure 3 is an autoradiogram of 32P-labeled EcoRI/MseI restriction fragments, separated
by gel electrophoresis (5% polyacrylamide + 8M urea sequencing gel), of samples from human
genomic DNA from four pairs of identical twins (matched in lanes 1 and 2; lanes 3 and 4; lanes
5 and 6; lanes 7 and 8 of each panel), using the EcoRI primer shown in SEQ ID NO:1, and either
the MseI primer shown in SEQ ID NO:2 (panel A) or the MseI primer shown in SEQ ID NO:3
(panel B). Lane 1: unaffected; lane 2: matched twin, schizophrenic. Lane 3: schizophrenic; lane
4: matched twin, unaffected. Lane 5: schizophrenic; lane 6: matched twin, unaffected. Lane 7:
bipolar; lane 8: matched twin, unaffected. M: DNA sizing markers.

Figure 4 is an autoradiogram of 32P-labeled EcoRI/MseI restriction fragments, separated
by gel electrophoresis (5% polyacrylamide + 8M urea sequencing gel), of samples prepared from
genomic DNA of Agrobacterium tumefaciens strain C58 (lanes 1, 2) or strain A136 (lanes 3, 4).

Figure 5 is an autoradiogram of a Southern blot, using 32P-labeled C58 hybridization
probes, of EcoRI (lanes 2-7) or EcoRI/MseI (lanes 8-11) restriction fragments of plasmid or
genomic DNA from various strains of A. tumefaciens. Lane 1: 1 kilobase marker; lane 2: pTi58
sample; lane 3: pTiA6 sample; lanes 4 and 8: C58 genomic DNA samples; lanes 5 and 9: A136
genomic DNA samples; lanes 6 and 10: LBA4404 (strain Ach5) genomic DNA samples; lanes 7
and 11: A6 genomic DNA samples.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides a method for identifying and isolating unique DNA 
fragments or genes from genomic DNA samples. It will be readily appreciated by those skilled 
in the art that using the methods of this invention, any genomic DNA fragment comprising a
sequence of contiguous nucleotide bases that is specifically contained within a given host genome 
may be identified and isolated. 

Sources of cDNA Libraries and Genomic DNA 

cDNA libraries and genomic DNA, as well as sources from which cDNA libraries and
genomic DNA may be prepared, are available commercially from a number of sources, including
Life Technologies, Inc. (Rockville, Maryland), American Type Culture Collection (ATCC;
Rockville, Maryland), Jackson Laboratories (Bar Harbor, Maine), Cell Systems, Inc. (Kirkland,
Washington) and Advanced Tissue Sciences (La Jolla, California). Cells that may be used as
starting materials for cDNA and genomic DNA preparation may be prokaryotic (bacterial,
including members of the genera Escherichia, Serratia, Salmonella, Staphylococcus,
Streptococcus, Clostridium, Chlamydia, Neisseria, Treponema, Mycoplasma, Borrelia,
Bordetella, Legionella, Pseudomonas, Mycobacterium, Helicobacter, Agrobacterium,
Colletotrichum, Rhizobium, and Streptomyces) or eukaryotic (including fungi or yeasts, plants,
protozoans and other parasites, and animals including humans and other mammals). Any
mammalian somatic cell may also be used for preparation of cDNA libraries and genomic DNA,
including blood cells (erythrocytes and leukocytes), endothelial cells, epithelial cells, neuronal cells
(from the central or peripheral nervous systems), muscle cells (including myocytes and myoblasts
from skeletal, smooth or cardiac muscle), connective tissue cells (including fibroblasts, adipocytes,
chondrocytes, chondroblasts, osteocytes and osteoblasts) and other stromal cells (e.g.,
macrophages, dendritic cells, Schwann cells). Mammalian germ cells (spermatocytes and oocytes)
may also be used for the preparation of cDNA libraries and genomic DNA, as may the
progenitors, precursors and stem cells that give rise to the above-described somatic and germ
cells. Also suitable for use in the preparation of cDNA libraries and genomic DNA are
mammalian tissues or organs such as those derived from brain, kidney, liver, pancreas, blood,
bone marrow, muscle, nervous, skin, genitourinary, circulatory, lymphoid, gastrointestinal and
connective tissue sources, as well as those derived from a mammalian (including human) embryo
or fetus. These cells, tissues and organs may be normal, or they may be pathological such as those
involved in infectious diseases (caused by bacteria, fungi or yeast, viruses (including HIV) or
parasites), in genetic or biochemical pathologies (e.g., cystic fibrosis, hemophilia, Alzheimer's
disease, schizophrenia, muscular dystrophy or multiple sclerosis), or in cancerous processes.

The methods of the invention may comprise one or more steps. For example, the
invention is directed to a method for identifying a DNA fragment from a first cDNA library or
sample of genomic DNA, which fragment is not present in a second cDNA library or sample of 
genomic DNA, comprising: (a) digesting the first and second cDNA libraries or samples of 
genomic DNA with at least one restriction enzyme to give a collection of restriction fragments; 
and (b) identifying one or more unique fragments from the first cDNA library or sample of
genomic DNA by comparing the fragments from the first cDNA library or sample of genomic 
DNA to the fragments from the second cDNA library or sample of genomic DNA. Analogously, 
the invention is directed to a method for identifying a DNA fragment from a second cDNA library 
or sample of genomic DNA, which fragment is not present in a first cDNA library or sample of 
genomic DNA, comprising: (a) digesting the first and second cDNA libraries or samples of
genomic DNA with at least one restriction enzyme to give a collection of restriction fragments; 
and (b) identifying one or more unique fragments from the second cDNA library or sample of 
genomic DNA by comparing the fragments from the second cDNA library or sample of genomic 
DNA to the fragments from the first cDNA library or sample of genomic DNA.

In one aspect of the invention, one cDNA library or sample of genomic DNA may be
derived from a sample from an animal suffering from an infectious disease (e.g., a disease of
bacterial, fungal, viral or parasitic origin) and the other may be from an animal
not suffering from an infectious disease. In another aspect, one cDNA library or sample of
genomic DNA may be derived from an animal suffering from cancer and the other may be derived
from an animal not suffering from cancer. In another aspect, one cDNA library or sample of
genomic DNA may be obtained from a cancerous animal tissue and the other may be obtained
from a noncancerous animal tissue, which tissues may both be obtained from the same animal.
In another aspect, one cDNA library or sample of genomic DNA may be from an animal suffering
from a genetic disease and the other cDNA library or sample of genomic DNA may be from an
animal not suffering from a genetic disease. In another aspect, one cDNA library or sample of
genomic DNA may be obtained from a pathogenic microorganism and the other library or sample
may be obtained from a non-pathogenic microorganism. In another aspect, one cDNA library or
sample of genomic DNA may be derived from an organism expressing an enzyme, and the other
may be derived from an organism not expressing an enzyme. Particularly preferred in this aspect
of the invention are cDNA libraries and samples of genomic DNA from organisms with differential
expression of a restriction enzyme, an enzyme degrading a petroleum product, a biodegradative
enzyme, a nucleic acid polymerase enzyme, a nucleic acid ligase enzyme, an amino acid synthetase
enzyme and an enzyme involved in carbohydrate fermentation; it is to be understood, however,
that cDNA libraries or samples of genomic DNA from organisms with differential expression of
any enzyme may be used in the methods of the present invention. In another aspect, one cDNA
library or sample of genomic DNA may be derived from an organism expressing an industrially
useful protein, and the second may be derived from an organism not expressing an industrially
useful protein. Particularly preferred in this aspect of the invention are cDNA libraries and
samples of genomic DNA from organisms with differential expression of proteins used in food and
beverage manufacturing (e.g., enzymes, flavorings, preservatives, bulking agents and the like), and
those used in chemical and pharmaceutical manufacturing (particularly enzymes, cofactors,
carriers, immunogens, preservatives, bulking agents and the like). In another aspect, one cDNA
library or sample of genomic DNA may be derived from a diseased plant and the other may be
derived from a non-diseased plant. In another aspect, one cDNA library or sample of genomic
DNA may be from a plant resistant to an environmental stress, which may be drought, excess
temperature, diminished temperature, chemical toxicity by herbicides, pollution, excess light or
diminished light, and the other may be from a plant not resistant to an environmental stress. Other
suitable sources of cDNA libraries and samples of genomic DNA will be apparent to one of
ordinary skill.

Once the starting cells, tissues, organs or other samples are obtained, cDNA libraries and
genomic DNA may be prepared therefrom by methods that are well-known in the art (see, for
example, Maniatis, T., et al., Cell 15:687-701 (1978); Okayama, H., and Berg, P., Mol. Cell. Biol.
2:161-170 (1982); Gubler, U., and Hoffman, B.J., Gene 25:263-269 (1983); Maniatis, T., et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York: Cold Spring Harbor
Laboratory Press, pp. 9.16-9.23 (1989); Kaufman, P.B., et al., Handbook of Molecular and
Cellular Methods in Biology and Medicine, Boca Raton, Florida: CRC Press, pp. 1-26 (1995),
the disclosures of which are incorporated herein by reference in their entireties). The cDNA
libraries and genomic DNA samples thus prepared, or those obtained from commercial sources,
may then be used to identify and isolate unique cDNA and genomic DNA fragments (i.e.,
tissue-specific fragments or genetic markers) by the methods of the present invention.

Purification of cDNA 

Having obtained cDNA libraries from various tissues, either from commercial sources or
by preparation as taught above, the cDNA molecules are purified in preparation for analysis by
AFLP. Detailed methodologies for purification of cDNAs are taught in the GENETRAPPER™
manual (LTI; Gaithersburg, Maryland), which is incorporated herein by reference in its entirety.
Bacterial hosts (E. coli is commonly used, although another suitable bacterial or yeast host may
also be used) containing therein plasmids comprising cDNAs of interest are grown in culture at
an appropriate temperature (30-37°C, depending upon the specific bacterial host used) overnight,
preferably for 12-24 hours, and most preferably for 18-24 hours. Any culture medium promoting
rapid growth of the host cells is suitable for use, although a tryptone-based broth culture is
preferred and most preferred is TBG broth containing 1-2% tryptone, 2-5% yeast extract, 0.1-1%
glycerol, 10-50 mM glucose, and concentrations of buffer salts that are standard in the art. Such
culture media are available commercially, for example from GIBCO/BRL (Gaithersburg,
Maryland).

After growth, culture broth containing host cells is transferred to sterile centrifuge
containers (tubes or bottles) and is centrifuged at 10,000-20,000 x g, most preferably at 16,000
x g, for 10-15 minutes at 4°C. Supernatants are then completely removed, by aspiration or
pouring off, taking care not to dislodge, resuspend or otherwise disturb pelleted host cells in the
bottoms of the centrifuge containers.

Host cell pellets are then subjected to a procedure to liberate plasmids containing cDNAs.
Procedures commonly used to perform this task include the use of lysozyme and physical pressure
(the "French Press"), freeze-thawing, or sonication; these procedures, however, will often lead
to shearing or fragmenting of target cDNAs, which is undesirable. A more preferred method to
free cDNA-containing plasmids from host cells is via alkaline lysis of the host cells, as it results
in less degradation of the target cDNA molecules. In this method, cell pellets are resuspended
in a low ionic strength buffer containing an alkali salt and a detergent. For example, a volume of
a solution (hereinafter "TE buffer") containing about 50 mM glucose, about 50 mM TRIS®-HCl
(pH 8.0), and about 10 mM disodium ethylenediaminetetraacetate (EDTA) is most preferably
used for resuspension of cell pellets. Two volumes of alkaline-detergent solution are then added
to this cell suspension to promote lysis of the host cells and liberation of cDNAs; most preferable
for this step is a solution of about 0.2 N sodium hydroxide and about 1% sodium dodecylsulfate,
although any alkaline-detergent solution of approximately equivalent pH and ionic strength may
be used.

After addition of the alkaline-detergent solution, the suspension in the centrifuge container
is thoroughly mixed and then incubated at 0-4°C (preferably in an ice bath) for approximately five
minutes. Following this incubation, the lysis solution is neutralized by the addition of an acid salt;
addition of 1/2 volume of about 3 M potassium acetate (pH 4.8) is preferred with the above
alkaline-detergent solution, although any acid salt of equivalent pH and ionic strength may also
be used. The solution in the centrifuge container is then gently mixed and centrifuged under the
same conditions as described above for pelleting the cells, to remove cellular debris from cDNA-
containing plasmids which remain in the supernatants.

Supernatants are then withdrawn (and pellets discarded) and transferred to a fresh, sterile
container. To effectuate precipitation of cDNAs, two volumes of absolute ethanol are added to
the supernatants, and the mixtures are then incubated for 5-30 minutes, preferably for 10-15
minutes, at -20° to -70°C, most preferably in a bath containing dry ice and ethanol. The mixtures
are then centrifuged (again as above) to pellet precipitated cDNAs. Supernatants are removed
by aspiration, again taking care to prevent disruption of the pellets, and pellets are resuspended
in a buffer solution, preferably a solution containing about 10 mM TRIS®-HCl (pH 8.0) and
about 1 mM EDTA.

The ethanol precipitation step described above will also result in the precipitation of RNA
molecules from the host cell, which will interfere with subsequent amplification and analysis of
tissue-specific cDNAs. To remove these unwanted RNAs, the samples may be treated with an
RNA-degrading enzyme such as RNase A (available commercially, for example, from
GIBCO/BRL, Gaithersburg, Maryland), which must be substantially free of contaminating DNase
enzymes to prevent degradation of the target cDNAs. Following treatment with RNase A,
cDNAs are isolated by extraction of the solutions with phenol, reprecipitation with ethanol and
recentrifugation according to methods that are well-known in the art (Lin and Kuo, FOCUS
17(2):66-70 (1995)). The final pellets, containing purified cDNAs, are then used for AFLP
analysis.

Identification of Tissue-Specific cDNAs and Genetic Markers 

Purified cDNA and genomic DNA may be examined by AFLP for identification of specific
(including tissue-specific) cDNAs or genetic markers according to the present invention. AFLP
was originally developed as a method for DNA fingerprinting analysis of bacterial, yeast, plant and
animal cells (EP 0 534 858; Vos, P., et al., Nucl. Acids Res. 23(21):4407-4414 (1995); Lin, J.-J.,
and Kuo, J., FOCUS 17(2):66-70 (1995); Lin, J.-J., et al., Plant Molec. Biol. Rep. 14(2):156-169
(1996)). In the present invention, the AFLP technique has been modified to provide, in one
embodiment, a method for identifying a tissue-specific cDNA from a cDNA library, or a genetic
marker from a sample of genomic DNA and, in another embodiment, a method for isolating these
specific cDNAs or genetic markers.

AFLP may be carried out using a commercially available system such as the AFLP
Analysis System I (Life Technologies, Inc.; Rockville, Maryland) which contains a detailed
methods manual, the disclosure of which is fully incorporated herein by reference. Alternatively,
AFLP analysis may be performed using a combination of materials and methods that are modified
from those commonly used in the art (Vos, P., et al., Nucl. Acids Res. 23(21):4407-4414 (1995);
Lin, J.-J., and Kuo, J., FOCUS 17(2):66-70 (1995); Lin, J.-J., et al., Plant Molec. Biol. Rep.
14(2):156-169 (1996)).

The power of the AFLP technique is based on its use of generic primers and "adaptors"
which allow amplification of DNA fragments without any prior knowledge of the nucleotide
sequences of those fragments. In this way, the AFLP-based method of the present invention is
more useful for identification of previously unknown tissue-specific cDNAs, and genomic genetic
markers, than is traditional PCR which requires prior knowledge of the nucleotide sequence of
the target DNA in order to design appropriate amplification primers.

In the initial step of AFLP, purified cDNA or genomic DNA is digested with a panel of
enzymes usually containing two restriction enzymes. Ordinarily, the two restriction enzymes have
sequence specificities sufficiently different from one another so as to prevent overlap of digestion
(and thus over-degradation) of the target DNA sequences. For example, the enzymes EcoRI and
MseI may be used in combination to digest target DNA, as the restriction site specificities of these
two enzymes are significantly different. However, other combinations of restriction enzymes may
be used in carrying out the present invention with equal likelihood of success.
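
The double digestion described above can be illustrated with a minimal in silico sketch. It assumes the standard recognition sites and top-strand cut positions for the two enzymes (EcoRI, G^AATTC; MseI, T^TAA); the function name `digest` and the toy sequence are hypothetical and are not part of the disclosed method.

```python
import re

# Recognition site and top-strand cut offset for each enzyme
# (EcoRI cuts G^AATTC, MseI cuts T^TAA).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}

def digest(seq, enzymes=ENZYMES):
    """Cut seq at every site of every enzyme and return the fragments in order."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        # A lookahead match finds overlapping occurrences of the site as well.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical toy sequence with one EcoRI site and one MseI site.
toy = "ACGTGAATTCGGCCTTAACGT"
fragments = digest(toy)
```

Only top-strand cut positions are modeled here; the single-stranded overhangs that the adaptors later pair with are not represented.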

Once the cDNA or genomic DNA has been digested with restriction enzymes (producing
"restriction fragments," hereinafter referred to as "RFs"), the resultant RFs are ligated with
adaptor sequences which extend the region on the RFs to which the PCR primers will bind during
amplification, thus forming DNA-adaptor complexes. The use of adaptors is necessary since after
digestion, the cut ends of the RFs, to which the PCR primers will bind, are often too short for
optimal binding of the primers. Accordingly, ligation of adaptor sequences to the cut ends of the
RFs extends the length of these primer binding sites, improving the efficiency of primer binding
and thus of amplification. The nucleotide sequences of these adaptors are chosen so as to contain
the nucleotide sequences at the restriction sites in the target cDNA or genomic DNA samples.
The adaptors usually will have a stretch of 2-8 contiguous nucleotides which are complementary
to the cut ends of the RFs; thus, the adaptors bind to the RFs via normal DNA base-pairing and
thereby extend the terminal sequence of the RFs.

Once the adaptors have been ligated to the DNA RFs, the fragments are amplified via PCR
according to standard methods used for cDNA fragment amplification (Lin, J.-J., and Kuo, J.,
FOCUS 17(2):66-70 (1995)), using PCR primer oligonucleotides that hybridize to the adaptor
portions of the DNA-adaptor complexes (i.e., the binding regions of the primers are
complementary to the sequences of the adaptors) under conditions used for PCR. This approach
provides the additional advantage that the actual sequences of the cDNA or genomic DNA
fragments that are the targets for amplification need not be known, since the primers are designed
to be specific for a restriction site rather than a particular gene. Accordingly, generic primers may
be used, with their nucleotide sequences being dependent upon the combination of restriction
enzymes used to digest the target cDNAs or genomic DNAs, as has been described for cDNAs
(Vos, P., et al., Nucl. Acids Res. 23(21):4407-4414 (1995); Lin, J.-J., and Kuo, J., FOCUS
17(2):66-70 (1995); Lin, J.-J., et al., Plant Molec. Biol. Rep. 14(2):156-169 (1996)). For
example, EcoRI primers contain the sequence of the EcoRI restriction site (underlined below)
coupled to core sequences and arbitrary extenders of three-base repeat units:

5'-CAU CAU CAU CAU GAC TGC GTA CCA ATT C-3'
[(CAU)4EcoRI+0] primer (SEQ ID NO:1)

5'-GAC TGC GTA CCA ATT CAC C-3'
[EcoRI+ACC] primer (SEQ ID NO:2)

Similarly, the MseI primers will contain the nucleotide sequence of the MseI restriction
site linked to different core and extender sequences:

5'-CUA CUA CUA CUA GAT GAG TCC TGA GTA A-3'
[(CUA)4MseI+0] primer (SEQ ID NO:3);

5'-GAT GAG TCC TGA GTA ACA A-3'
[MseI+CAA] primer (SEQ ID NO:4); or

5'-GAT GAG TCC TGA GTA ACA C-3'
[MseI+CAC] primer (SEQ ID NO:5).
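
The relationship among the five primers listed above can be checked with a short sketch. It assumes, from inspection of the listed sequences, that each primer is a shared enzyme-specific core joined either to a 3' selective extension (e.g., ACC) or to a 5' tail of four three-base repeats; the constant and function names are hypothetical.

```python
# Shared cores read off the primer sequences listed in the text.
ECORI_CORE = "GACTGCGTACCAATTC"
MSEI_CORE = "GATGAGTCCTGAGTAA"

def selective_primer(core, extension=""):
    """Core followed by 3' selective nucleotides (an empty extension is a '+0' primer)."""
    return core + extension

def tailed_primer(repeat, core, copies=4):
    """Core preceded by a 5' tail of three-base repeat units, e.g. (CAU)4."""
    return repeat * copies + core

seq_id_1 = tailed_primer("CAU", ECORI_CORE)    # [(CAU)4EcoRI+0]
seq_id_2 = selective_primer(ECORI_CORE, "ACC")  # [EcoRI+ACC]
seq_id_3 = tailed_primer("CUA", MSEI_CORE)      # [(CUA)4MseI+0]
seq_id_4 = selective_primer(MSEI_CORE, "CAA")   # [MseI+CAA]
seq_id_5 = selective_primer(MSEI_CORE, "CAC")   # [MseI+CAC]
```

Because the cores depend only on the enzyme pair, changing the selective extension changes which subset of adaptor-ligated fragments is amplified without requiring any knowledge of the target sequence.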

These primers may be detectably labeled, preferably with a radioisotope of phosphorus
(32P or 33P), although other detectable labels such as fluorophors or phosphors, enzymes, or
biotin/avidin may be used as well.

Following amplification, the samples are prepared for separation of the DNA fragments, 
a procedure which permits the determination of the expression of tissue- or cell-specific cDNAs 
in the cDNA libraries, or the presence of specific genetic markers in the genomic DNA samples. 
The fragments may be separated by any physical or biochemical means including gel 
electrophoresis, chromatography (including sizing, affinity and immunochromatography), density 
gradient centrifugation and immunoadsorption. In the practice of the present invention, 
separation of DNA fragments by gel electrophoresis is particularly preferred, as it provides a rapid 
and highly reproducible means of sensitive separation of a multitude of DNA fragments, and 
permits direct comparison of the fragments in several cDNA libraries or samples of genomic DNA 
simultaneously. 

Gel electrophoresis is typically performed on agarose or polyacrylamide sequencing gels 
according to standard protocols (Lin, J.-J., and Kuo, J., FOCUS 17(2):66-70 (1995)), preferably
using gels containing polyacrylamide at concentrations of 3-8% and most preferably at about 5%, 
and containing urea at a concentration of about 8M. Samples are loaded onto the gels, usually 
with samples containing cDNAs or genomic DNA fragments prepared from different sources 
being loaded into adjacent lanes of the gel to facilitate subsequent comparison.

Following electrophoretic separation, DNA fragments may be visualized and identified by
a variety of techniques that are routine to those of ordinary skill in the art. In a first such
technique, the gel is dried using a commercial gel dryer and exposed to X-ray (for detection of
radioisotopes) or high-sensitivity photographic (for detection of fluorophors or phosphors) film.
After development, the film is examined for the pattern of bands in each lane of the gel, each band
corresponding to a different DNA species or fragment (see Figs. 1-5). The migration of DNA
fragments within the gel is inversely related to their size (length and/or molecular weight), i.e.,
larger fragments migrate more slowly (and thus form bands closer to the top of the gel), while
smaller fragments migrate more quickly (and thus form bands closer to the gel bottom). One can
thus examine the films for the presence of one or more unique bands in one lane of the gel (see
arrows in Figs. 1-4); the presence of a band in one lane (corresponding to a single sample, cell or
tissue type) that is not observed in other lanes indicates that the DNA fragment comprising that
unique band is source-specific and thus a potential tissue- or cell-specific cDNA or genomic
genetic marker.
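
The lane-by-lane comparison described above amounts to finding bands that occur in exactly one lane. The following is a minimal sketch under simplifying assumptions: each band is represented by an exact fragment length in base pairs, whereas real band positions would need to be matched within a tolerance. The lane names, band sizes and function name are hypothetical.

```python
from collections import Counter

def unique_bands(lanes):
    """Map each lane name to the band sizes that appear in that lane only."""
    # Count, for every band size, the number of lanes in which it occurs.
    lane_counts = Counter(size for bands in lanes.values() for size in set(bands))
    return {
        name: sorted(size for size in set(bands) if lane_counts[size] == 1)
        for name, bands in lanes.items()
    }

# Hypothetical band sizes (bp) read from three adjacent lanes.
lanes = {
    "brain": [150, 220, 310, 405],
    "kidney": [150, 220, 405],
    "liver": [150, 220, 405, 520],
}
candidates = unique_bands(lanes)
```

A band reported for a lane is only a candidate source-specific fragment; confirming it still requires the excision, cloning and sequencing steps of the method.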

Alternatively, DNA fragments may be visualized by staining the gel with a nucleic acid-
binding stain such as ethidium bromide or silver stain. The DNA fragments are then visualized
by illumination of the gel with a wavelength range of light specific for the stain used, e.g., UV
illumination for ethidium bromide or visible light for silver stain.

Isolation and Characterization of Tissue-Specific cDNAs and Genomic Genetic Markers

A variety of tissue-specific cDNAs and genomic DNA fragments comprising genetic
markers can thus be identified using the methods of the present invention by comparing the
pattern of bands on the films depicting various samples. One can extend this approach, in another
embodiment of the invention, to isolate and characterize these genetic markers. In this
embodiment, one or more of the specific DNA fragments are removed from the dried gel which
was used for identification (see above). Removal of these fragments from the gel may be effected
by a number of means including electroelution or preferably by physical excision. This excision
is preferably accomplished by overlaying the developed film (autoradiogram) directly over the
dried gel, thus allowing the developed film to be used as a guide or template to localize the
fragments of interest in the gel. The fragments represented by unique bands on the autoradiogram
may then be carefully cut from the dried gel through the corresponding band on the film using,
for example, a scalpel, razor or scissors. The DNA is then eluted from the gel by incubating the
slice for about 18-24 hours at 37°C in TE buffer. Following elution, the DNA sample in TE
buffer is loaded into a syringe containing sterilized glass wool and filtered through the glass wool
into a sterile tube via centrifugation at about 250-500 x g for about 10 minutes at about 20-25°C.
Alternatively, this filtration may be accomplished via other chromatographic methods that are well
known in the art, such as using standard glass wool columns and peristaltic pumping. After being
filtered through glass wool, the DNA-containing sample is filtered through a desalting/buffer
exchange column (e.g., using SEPHADEX® or a pre-packed PD-10 column available from
Pharmacia, Piscataway, New Jersey) according to the manufacturer's instructions. This
desalting/buffer exchange step may be accomplished by other methods routine in the field, e.g.,
via batch dialysis, although the use of columns for this purpose overcomes the longer time
required, higher cost and sample loss that often accompany standard dialysis methods. The unique
cDNA or genomic DNA fragments may then be eluted from the desalting column in deionized,
distilled water and lyophilized and stored at 4°C to -70°C until use. Alternatively, these AFLP-
defined, tissue-specific fragments or genetic markers can be immediately dissolved in TE buffer
and re-amplified as outlined above to increase their concentration. Prior to or following this
amplification, the unique cDNA or genomic DNA fragments may be inserted into standard
nucleotide vectors (such as expression vectors) suitable for transfection or transformation of a
variety of prokaryotic (bacterial) or eukaryotic (yeast, plant or animal including human and other
mammalian) cells.

Use of Unique cDNA and Genomic DNA Fragments

The tissue- or cell-specific cDNAs, or genomic DNA fragments comprising genetic
markers, that are identified and isolated by the methods of the present invention may be further
characterized, for example by cloning and sequencing (i.e., determining the nucleotide sequences
of the cDNA or genomic DNA fragments), by methods described above and others that are
standard in the art (see also U.S. Patent Nos. 4,962,022 and 5,498,523, which are directed to
methods of DNA sequencing). Alternatively, these fragments may be used for the manufacture
of various materials in industrial processes, such as hybridization probes or therapeutic proteins
(dependent upon transcription and translation of the DNA fragments, or the production of
synthetic peptides or proteins with amino acid sequences deduced from the nucleotide sequences
of the specific cDNAs or genetic markers) by methods that are well-known in the art. Production
of hybridization probes from tissue-specific cDNAs and unique genomic DNA fragments will, for
example, provide the ability for those in the medical field to examine a patient's cells or tissues for
the presence of a particular genetic marker such as a marker of cancer, of an infectious or genetic
disease, or of embryonic development, or a tissue-specific marker. Particularly
suitable for diagnosis by the methods of the present invention are genetic diseases such as cystic
fibrosis, hemophilia, Alzheimer's disease, schizophrenia, muscular dystrophy or multiple sclerosis.
Also suitable for identification by the methods of the present invention are genetic markers
associated with pathogenicity (e.g., virulence genes) of microorganisms. In addition, the presence
of genetic markers of schizophrenia in patient samples may be determined by the present methods.

Furthermore, such hybridization probes can be used to isolate DNA fragments from genomic
DNA or cDNA libraries prepared from a different cell, tissue or organism for further
characterization. In this application of the present invention, hybridization probes comprising the
AFLP-defined unique fragments identified above, or one or more oligonucleotide probes
complementary to these fragments, are hybridized under conditions of stringent hybridization with
genomic DNA or a first cDNA library prepared from a cell, tissue or organism, such as any of
those described above. As used herein, the term "stringent hybridization conditions" is defined,
as is generally understood in the field, as incubation of the genomic DNA or first cDNA library
with the hybridization probe(s) for 18-24 hours at about 42°C in a solution comprising 5X SSC
(1X SSC = 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH about 7.6),
5X Denhardt's solution, 50% formamide, 10% dextran sulfate and 20 µg/ml denatured, sheared
salmon sperm DNA. Following hybridization, the samples may be washed in 0.1X SSC at about
65°C to further reduce nonspecific background, and the unique genomic DNA or cDNA
fragments so isolated may be amplified and characterized as described above. Together, these
abilities will assist medical professionals and patients in diagnostic and prognostic determinations
as well as in the development of treatment and prevention regimens for these and other disorders.
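
The salt concentrations implied by the SSC strengths in the stringent hybridization conditions above follow directly from the stated 1X definition. The sketch below only performs that scaling; the function and variable names are hypothetical.

```python
# 1X SSC as defined in the text: 150 mM NaCl, 15 mM trisodium citrate.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}

def ssc_concentrations(fold):
    """Return the millimolar concentration of each SSC salt at the given strength."""
    return {salt: fold * mm for salt, mm in SSC_1X_MM.items()}

hybridization = ssc_concentrations(5)  # 5X SSC used during hybridization
wash = ssc_concentrations(0.1)         # 0.1X SSC used for the wash

# Ratio of NaCl concentration between the hybridization and wash steps.
salt_reduction = hybridization["NaCl"] / wash["NaCl"]
```

The wash therefore lowers the salt concentration roughly fifty-fold relative to the hybridization solution, which together with the higher wash temperature is what makes the wash the more stringent step.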

It should also be apparent that this method can be used to screen animal tissues to be 
subsequently used in medical procedures such as tissue or organ transplants, blood transfusions, 
zygote implantations and artificial inseminations. In such procedures, pre-screening of the subject
tissues for the presence of particular genetic markers may improve the success of tissue or organ 
transplants (by decreasing the likelihood of rejection due to donor-recipient genetic 
incompatibility) and of zygote implantations (by eliminating the use of genetically defective 
zygotes). Similarly, use of these methods will reduce the chances of transmission of infectious
diseases (e.g., hepatitis and AIDS) in medical procedures that are often prone to such
transmission, such as blood transfusions and artificial insemination. Finally, use of the present 
invention for identification and isolation of unique tissue-specific cDNAs and genomic DNA 
fragments will assist in forensic science in such applications as crime-scene analysis of blood, 
tissue and body secretions containing small amounts of DNA, as well as in paternity testing. 

It will be readily apparent to one of ordinary skill in the relevant arts that other suitable
modifications and adaptations to the methods and applications described herein are obvious and
may be made without departing from the scope of the invention or any embodiment thereof.
Having now described the present invention in detail, the same will be more clearly understood
by reference to the following examples, which are included herewith for purposes of illustration
only and are not intended to be limiting of the invention.

Examples 

Materials and Methods 

The following materials and methods were used for all examples: 

Human cDNA libraries of brain, kidney, leukocytes and liver were purchased from Life
Technologies, Inc. (Rockville, Maryland). Purification of cDNA was performed by inoculating
1 x 10^6 bacterial cells into 100 ml of TB broth in a 250 ml flask, and incubating at 30°C overnight
as described in the GENETRAPPER™ (LTI) manual. Bacterial cells were harvested by
centrifugation, and plasmid DNA purified from the resultant bacterial pellets as described (Lin et
al., Id.). Briefly, cell pellets were suspended in a TRIS-buffered SDS/EDTA solution, incubated
on ice to allow disruption of cells, extracted with potassium acetate, and the extract clarified by
centrifugation. DNA in clarified supernatants was precipitated with absolute ethanol, pelleted,
and resuspended in a TRIS-buffered EDTA (TE) solution. RNA in the samples was degraded
with RNase A, and the DNA was extracted with phenol and re-precipitated with ethanol and
pelleted by centrifugation at 16,000 x g at 4°C for 10 minutes. Resultant pellets were suspended
in TE prior to being used in all experiments.

Genomic DNA was isolated from leukocytes of four pairs of identical twins kindly
provided by Dr. Yolken (Johns Hopkins University, Baltimore, Maryland). For each pair, one
individual was normal while the other was diagnosed as schizophrenic or bipolar.

For analysis of cDNAs and genomic DNA fragments, the Life Technologies, Inc., AFLP
Analysis System I (Catalogue No. 10544) was used as described (Lin, J.-J., and Kuo, J., FOCUS
17(2):66-70 (1995)). Briefly, 500 ng of cDNA or genomic DNA, isolated as described above,
were digested with EcoRI and MseI, ligated with EcoRI and MseI adapters, and amplified via
PCR using radiolabeled selective primers for EcoRI (SEQ ID NO:2) and MseI (SEQ ID NOs:4, 5)
as recommended by the manufacturer. Amplified fragments were separated by polyacrylamide
gel electrophoresis, and a unique DNA fragment was sliced from the sequencing gel. This unique
fragment was amplified with [(CAU)4EcoRI+0] (SEQ ID NO:1) and [(CUA)4MseI+0] (SEQ ID
NO:3) primers, annealed into a pAMP-1 vector, treated with UDG and introduced into
transformation-competent E. coli DH10B host cells (Life Technologies, Inc.; Rockville,
Maryland) by electroporation (Lin, J.-J., et al., FOCUS 7^:98-101 (1993)). After growth of
the cells in selective media containing ampicillin, plasmid DNA was isolated and digested with
restriction endonucleases to determine bacterial clones prior to sequencing the plasmid DNA using
the dsDNA Cycle Sequencing System (Life Technologies, Inc.; Rockville, Maryland) according
to manufacturer's instructions.

Example 1

To evaluate the ability of the AFLP-based method of the present invention to identify
unique sequences in a cDNA library, different amounts of plasmid pCMVSPORT containing the
chloramphenicol resistance gene (pCMVSPORTCAT) were added into 500 ng of cDNA isolated
from a human brain cDNA library. AFLP was performed as described above on samples of brain
cDNA with or without pCMVSPORTCAT cDNA, and the restriction patterns of these samples
determined by gel electrophoresis (Fig. 1). Two unique bands (arrows) were detected in the
samples containing mixtures of human brain cDNA and pCMVSPORTCAT cDNA (Fig. 1, lanes
2-5), which were not found in samples containing only brain cDNA (Fig. 1, lane 6). These
bands co-migrated in the gel with two prominent bands found in a control sample containing only
plasmid DNA (Fig. 1, lane 1). One of these unique bands was excised from the gel, amplified, and
cloned into E. coli host cells via a pAMP-1 vector. Plasmid DNA isolated from transformed
colonies was then run on a sequencing gel, and the nucleotide sequence of the unique fragment
compared to known sequences in the GenBank sequence database. The cloned and amplified
sequence was found to be homologous to the GenBank sequences of the chloramphenicol
resistance gene. These results illustrate the ability of the AFLP-based method of the present
invention to identify a library-specific DNA sequence.
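
The digest-and-compare logic underlying this Example can be illustrated in silico. The following Python sketch is offered only as an illustration and is not part of the AFLP Analysis System I protocol: it cuts toy sequences at EcoRI (G^AATTC) and MseI (T^TAA) sites, keeps fragments bounded by one cut of each enzyme, and reports fragment sizes present in a first sample but absent from a second. The function names (cut_positions, ecori_msei_fragments, unique_band_sizes) and the toy sequences are hypothetical. Adapter ligation, the selective 3' primer bases, and the bottom strand are deliberately ignored, so the sizes do not correspond to real gel bands.

```python
import re

# Recognition sites and top-strand cut offsets (G^AATTC for EcoRI, T^TAA for MseI).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return sorted (position, enzyme) cut points on a linear top strand."""
    cuts = []
    for name, (site, offset) in enzymes.items():
        # A lookahead pattern reports every occurrence, including overlapping ones.
        for match in re.finditer(f"(?={site})", seq):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)


def ecori_msei_fragments(seq):
    """Return fragments bounded by one EcoRI cut and one MseI cut."""
    seq = seq.upper()
    cuts = cut_positions(seq)
    fragments = []
    for (pos1, enz1), (pos2, enz2) in zip(cuts, cuts[1:]):
        if {enz1, enz2} == {"EcoRI", "MseI"}:
            fragments.append(seq[pos1:pos2])
    return fragments


def unique_band_sizes(first_sample, second_sample):
    """Fragment sizes found in first_sample but not in second_sample."""
    sizes_first = {len(f) for s in first_sample for f in ecori_msei_fragments(s)}
    sizes_second = {len(f) for s in second_sample for f in ecori_msei_fragments(s)}
    return sorted(sizes_first - sizes_second)


if __name__ == "__main__":
    # Toy stand-ins for a library sequence and a spiked-in plasmid sequence.
    brain = "CCGAATTCAGGCTAGCTTAACC"
    plasmid = "TTGAATTCGGATCCGGATCCGTTAAGG"
    print(unique_band_sizes([brain, plasmid], [brain]))  # prints [19]
```

In this toy run the only size unique to the spiked sample comes from the plasmid sequence, mirroring the way the plasmid-derived bands stood out against the brain cDNA background in Fig. 1.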


Example 2 

To further demonstrate the utility of the AFLP-based method of the present invention to
identify tissue-specific DNA sequences, AFLP was performed as described above on samples of
cDNAs isolated from human brain, kidney, leukocyte and liver libraries (obtained from
GIBCO/BRL, Gaithersburg, Maryland), and the restriction patterns of these samples determined
by gel electrophoresis (Fig. 2). At least one unique band (arrow) was detected in the samples
from the brain cDNA library (Fig. 2, lanes 7, 8), which was not found in the samples prepared
from the other tissue types (Fig. 2, lanes 1-6). This brain-specific fragment was excised from the
gel, amplified and cloned into E. coli host cells using the pAMP-1 vector. Plasmid DNA purified
from transformed colonies was sequenced via gel electrophoresis, and the sequence of the
brain-specific fragment compared to known sequences in the GenBank sequence database. The cloned
and amplified brain-specific fragment was found to be homologous to a brain-specific cDNA
previously reported (Adams et al., Nature 355:632-634 (1992)). These results further illustrate
the power of the AFLP-based method of the present invention in identifying a unique DNA
sequence that is not found in other tissues.

Example 3

To demonstrate the utility of the AFLP-based method of the present invention in isolating
and identifying DNA from a whole tissue, oligonucleotide probes were prepared from 20
brain-specific sequences resolved as in Example 2. These oligonucleotides were then used to obtain a
full-length brain cDNA from a whole brain library by hybridization using GENETRAPPER (LTI).
Using this approach, several cDNA clones were obtained, and one of these clones was sequenced
by gel electrophoresis. Upon comparison with GenBank sequences, the isolated brain-specific
cDNA was found to be identical to that reported previously from brain (Adams et al., Nature
355:632-634 (1992)). These results indicate that, in addition to its usefulness in identifying a
DNA unique to a particular tissue type, the AFLP-based method of the present invention may be
used for the isolation of a tissue-specific DNA fragment from a complex genome, tissue or cDNA
library.

Example 4

To determine the efficacy of the present invention in distinguishing diseased plant tissues
from those that are not diseased, mRNA was isolated from soybean (Glycine max L. Merr.) roots
that were infected or not with cyst nematode. cDNA libraries were constructed from these
mRNAs, and total cDNA was prepared from these two libraries and analyzed according to the
present invention. Several unique DNA bands, identified in the cyst nematode-infected cDNA
library but not in that from non-infected plants, were isolated from the sequencing gel. These
unique fragments were amplified with EcoRI and MseI primers, annealed to pAMP-1, and
transformed into E. coli, as described above. After expansion of the cultures, plasmid DNAs
containing the desired inserts were purified, blotted onto nylon membrane and hybridized to
32P-labeled cDNA prepared from the total RNA of either cyst nematode-infected or -noninfected
plants. Two plasmid DNAs showed strong hybridization signals with the cDNA probe prepared
from infected plants but not with that from noninfected plants. Upon sequencing and GenBank
comparison, these two clones showed significant homology to sequences from pea (Pisum sativum).
These results demonstrate that disease-inducible genes are capable of being identified by the
AFLP-based methods of the present invention.

Example 5 

To demonstrate the utility of the present invention in examining genetic relationships
between different organisms, studies were conducted in a variety of microorganisms such as
E. coli, Agrobacterium spp., Xanthomonas, Pseudomonas, and Colletotrichum. Genomic DNA was
prepared from these organisms, digested with restriction enzymes and analyzed as above.
Representations of the phenogenetic relationships between these organisms (such as dendrograms)
were prepared by densitometric scanning of the resultant autoradiogram and analyzing the
similarity (i.e., calculating a "percent similarity index") using computer programs for DNA
fingerprinting analysis such as that available from Bio-Rad (Hercules, California). The results of
these studies demonstrate that DNA markers identified by the present invention provide a
powerful means for the determination of familial genetic relationships between a variety of
prokaryotic (and, by extension, eukaryotic) organisms. This technique should also prove useful
for a determination of the distribution of infectious diseases throughout the world. Moreover,
similar results can be achieved by applying this technique to cDNA libraries prepared from
prokaryotic organisms.
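
The "percent similarity index" is not defined further in this Example, and the formula used by the commercial fingerprinting software is not stated here. As one plausible illustration only, the Python sketch below assumes a Dice-type index, i.e. twice the number of shared bands divided by the total number of bands in the two lanes, expressed as a percentage. The function names and the band sizes are hypothetical; bands are treated as exact integer sizes, whereas densitometric software would also weigh band intensity and allow a size tolerance.

```python
from itertools import combinations


def percent_similarity(bands_a, bands_b):
    """Dice-type similarity (assumed formula): 200 * shared / (n_a + n_b)."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 100.0  # two empty lanes are treated as identical
    return 200.0 * len(a & b) / (len(a) + len(b))


def similarity_matrix(profiles):
    """Pairwise percent similarity for a dict mapping organism name -> band sizes."""
    names = sorted(profiles)
    return {
        (x, y): percent_similarity(profiles[x], profiles[y])
        for x, y in combinations(names, 2)
    }


if __name__ == "__main__":
    # Hypothetical band sizes (in base pairs) read from three lanes.
    profiles = {
        "strain_A": [120, 185, 240, 310],
        "strain_B": [120, 185, 260, 310],
        "strain_C": [95, 240, 400],
    }
    for pair, score in similarity_matrix(profiles).items():
        print(pair, round(score, 1))
```

A pairwise table of this kind is the usual input to a clustering routine that draws a dendrogram; the clustering step itself is omitted here.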

Example 6 

To demonstrate the utility of the present invention in identifying genetic markers, samples
of genomic DNA from matched pairs of twins, wherein one individual was normal (unaffected)
and the other was diagnosed as schizophrenic or bipolar, were analyzed by AFLP. As
demonstrated in Figure 3, a number of potential genetic markers were identified between these
four pairs of twins. More importantly, several such potential markers were detected between
matched pairs of twins (arrows); these genetic markers were most evident in the matched pair
depicted in lanes 3 and 4 of Figure 3A. These unique DNA fragments between the individuals in
lanes 3 and 4 were consistently detected even when alternative primer pairs were used for AFLP
analysis (Figure 3B; see lanes 3 and 4). These results indicate that the AFLP-based methods of
the present invention provide a powerful way to identify genetic markers, based on subtle
differences in genomic DNA, between individuals who may even be as closely related as identical
twins.
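
The lane-to-lane comparison described in this Example can be sketched computationally. The Python fragment below is a minimal illustration, assuming that each lane has already been reduced to a list of estimated band sizes; it reports bands in one lane that have no counterpart within a size tolerance in the paired lane. The function name discordant_bands, the tolerance of 1.0, and the band sizes are hypothetical and are not taken from Figure 3.

```python
def discordant_bands(lane_a, lane_b, tol=1.0):
    """Return (bands only in lane_a, bands only in lane_b).

    A band counts as shared when the other lane has a band whose size
    differs by no more than tol.
    """
    only_a = [x for x in lane_a if all(abs(x - y) > tol for y in lane_b)]
    only_b = [y for y in lane_b if all(abs(y - x) > tol for x in lane_a)]
    return only_a, only_b


if __name__ == "__main__":
    # Hypothetical band sizes for one matched pair of lanes.
    unaffected = [102.0, 150.5, 233.0, 310.0]
    affected = [102.4, 150.0, 198.0, 233.3, 310.0]
    print(discordant_bands(unaffected, affected))  # prints ([], [198.0])
```

Repeating the comparison for each primer pair and keeping only the lane pairs that differ every time would correspond to the consistency check described for Figures 3A and 3B.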

Example 7 

By the present invention, the genetic markers identified in Example 6 may be isolated and
sequenced. Potential genetic markers such as those denoted by the arrows in Figure 3 are excised
from the sequencing gels, amplified using universal AFLP primers and cloned into pAMP-1 as
described above. The DNA sequences of these amplified genetic markers are then determined,
using any of various sequencing methodologies that are well-known in the art (Maxam, A.M. and
Gilbert, W., Proc. Natl. Acad. Sci. USA 74:560-564 (1977); Sanger, F., et al., Proc. Natl. Acad.
Sci. USA 74:5463-5467 (1977)). Alternatively, sequencing of the genetic markers is
accomplished using automated DNA sequencing apparatus. After sequencing, PCR primer
sequences are constructed as described in U.S. Patent Nos. 4,683,195; 4,683,202; and 4,800,159
and used for amplification of other samples of genomic DNA for AFLP determination of the
presence of genetic markers for schizophrenia. In this way, the methods provided by the present
invention allow a physical diagnosis of schizophrenia to be made, complementing the
accompanying psychological diagnosis. Moreover, isolated DNA sequences which have a
functional open reading frame are used as target oligonucleotides for isolating and characterizing
schizophrenia-related functional genes and the proteins encoded by such genes.
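
Whether a cloned marker contains an open reading frame can be checked with a simple scan of all six reading frames. The Python sketch below is illustrative only and is not a method disclosed in this specification: it assumes the standard start codon ATG and the stop codons TAA, TAG and TGA, and returns the longest ATG-to-stop stretch on either strand. The function names revcomp and longest_orf and the test sequence are hypothetical. Finding such a stretch does not by itself show that the reading frame is functional.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Longest ATG...stop stretch over all six reading frames, or '' if none."""
    seq = seq.upper()
    best = ""
    for strand in (seq, revcomp(seq)):
        for frame in range(3):
            i = frame
            while i + 3 <= len(strand):
                if strand[i:i + 3] != "ATG":
                    i += 3
                    continue
                # Walk codon by codon from the start codon to the first stop.
                j = i
                while j + 3 <= len(strand) and strand[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 <= len(strand):  # an in-frame stop codon was found
                    orf = strand[i:j + 3]
                    if len(orf) > len(best):
                        best = orf
                i = j + 3
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGGCTGAATTCTAAGG"))  # prints ATGGCTGAATTCTAA
```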

Example 8 

Agrobacterium tumefaciens is a soil-dwelling plant pathogenic bacterium. The
pathogenicity of certain strains of A. tumefaciens is imparted by the presence of the
extrachromosomal Ti plasmid which is about 400 kilobases in size. Therefore, AFLP was
performed on samples of genomic DNA from strains C58 (a pathogenic strain) and A136 (a
nonpathogenic strain) of A. tumefaciens, to determine the presence of genetic markers
distinguishing these strains.

As shown in Figure 4, several potential genetic markers were observed in strain C58 (lanes
1 and 2; see arrows) that were not present in strain A136 (lanes 3 and 4). Six of these
pathogen-specific markers were isolated from the gels, cloned into pAMP-1 vectors as described above, and
characterized by DNA sequencing and Southern blot hybridization. Upon comparison of these
unique genetic markers with available sequences in GenBank, four of the AFLP-defined genetic
markers observed in A. tumefaciens strain C58 were found to correspond to functional genes in
the Ti plasmid (see Table 1).

Table 1. Sequence Comparisons between clones containing AFLP-defined polymorphic
DNA from A. tumefaciens strain C58 and genes from GenBank.

A. tumefaciens   Number of clones     Homologous genes                                  Genomic Southern
strain           with AFLP-defined    from GenBank                                      blot hybridization
                 genetic markers

C58              1                    acs gene within T-DNA of C58 nopaline plasmid
C58              2                    pTiC58 virB, virG, and virC genes of C58
C58              3                    pinF1 and pinF2 genes from pTiA6
C58              4                    virE locus of pTiC58                              pTiC58, C58 genomic DNA
C58              5                    unidentified                                      pTiC58, C58 genomic DNA
C58              6                    unidentified                                      pTiC58, C58 genomic DNA
A6               1                    occR gene from pTiA6
A6               2                    octopine synthetase in T-DNA region of pTiAch5

To determine their possible identity, the two unknown AFLP-defined genetic markers in
strain C58 were further analyzed by examining their abilities to hybridize on Southern blots with
plasmid or genomic DNA sequences from various strains of A. tumefaciens. As shown in Figure
5, the two unknown AFLP-defined markers hybridized with Ti plasmid (lane 2) and genomic
(lanes 4, 8) DNA sequences isolated from C58, but not with genomic DNA isolated from
A. tumefaciens strain A136 (lanes 5, 9) or with genomic DNA or Ti plasmid sequences isolated
from A. tumefaciens octopine-like strains Ach5 (lanes 6, 10) and A6 (lanes 3, 7, 11). Together,
these results indicated that AFLP is capable of determining the presence of potential genetic
markers of pathogenicity or virulence in different strains of bacteria.

Having now fully described the present invention in some detail by way of illustration and
example for purposes of clarity of understanding, it will be obvious to one of ordinary skill in the
art that the same can be performed by modifying or changing the invention within a wide and
equivalent range of conditions, formulations and other parameters without affecting the scope of
the invention or any specific embodiment thereof, and that such modifications or changes are
intended to be encompassed within the scope of the appended claims.

All publications, patents and patent applications mentioned in this specification are
indicative of the level of skill of those skilled in the art to which this invention pertains and are
herein incorporated by reference to the same extent as if each individual publication, patent or
patent application was specifically and individually indicated to be incorporated by reference.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: Life Technologies, Inc.
          (B) STREET: 9800 Medical Center Drive
          (C) CITY: Rockville
          (D) STATE: Maryland
          (E) COUNTRY: USA
          (F) POSTAL CODE (ZIP): 20850

     (ii) TITLE OF INVENTION: Methods for Identification and Isolation of
          Specific Nucleotide Sequences in cDNA and Genomic DNA

     (iii) NUMBER OF SEQUENCES: 27

     (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

     (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US (to be assigned)
          (B) FILING DATE: 29-AUG-1997

     (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/028,519
          (B) FILING DATE: 18-OCT-1996

     (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/024,864
          (B) FILING DATE: 30-AUG-1996



(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 28 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: cDNA

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

     CAUCAUCAUC AUGACTGCGT ACCAATTC                                        28



(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 19 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: cDNA

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

     GACTGCGTAC CAATTCACC                                                  19

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 28 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: cDNA

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

     CUACUACUAC UAGATGAGTC CTGAGTAA                                        28

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 19 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: cDNA

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

     GATGAGTCCT GAGTAACAA                                                  19

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 19 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: cDNA

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

     GATGAGTCCT GAGTAACAC                                                  19

WHAT IS CLAIMED IS:



1. A method for identifying a DNA fragment from a first cDNA library or sample of
genomic DNA, said DNA fragment not being present in a second cDNA library or sample of
genomic DNA, said method comprising the steps of:

(a) digesting first and second cDNA libraries or samples of genomic DNA
with at least one restriction enzyme to give a collection of restriction fragments; and

(b) identifying one or more unique fragments from said first cDNA library or
sample of genomic DNA by comparing the fragments from said first library or sample to
fragments from said second library or sample.

2. A method for identifying a DNA fragment from a second cDNA library or sample
of genomic DNA, said DNA fragment not being present in a first cDNA library or sample of
genomic DNA, said method comprising the steps of:

(a) digesting first and second cDNA libraries or samples of genomic DNA with
at least one restriction enzyme to give a collection of restriction fragments; and

(b) identifying one or more unique fragments from said second cDNA library
or sample of genomic DNA by comparing the fragments from said second library or sample to
fragments from said first library or sample.

3. The method of claim 1 or claim 2, wherein said identifying step is accomplished
by separating the restriction fragments according to size.

4. The method of claim 1 or claim 2, wherein said restriction fragments are amplified
prior to said identifying step (b).

5. The method of claim 1 or claim 2, wherein said restriction fragments are detectably
labeled.

6. The method of claim 3, wherein said restriction fragments are amplified prior to
said separation according to size.

7. The method of claim 1 or claim 2, further comprising the steps of: 

(c) isolating at least one unique DNA fragment; and 

(d) inserting said DNA fragment into a vector. 

8. The method of claim 7, wherein said fragment is amplified prior to insertion into
said vector.

9. The method of claim 1 or claim 2, further comprising sequencing said unique
fragment.

10. A method for isolating a DNA molecule from a first cDNA library or sample of
genomic DNA, said method comprising the steps of:

(a) mixing one or more of the unique fragments identified according to claim 1
or claim 2, or one or more oligonucleotide probes which are complementary to said fragments,
with a first cDNA library or sample of genomic DNA under conditions stringent for hybridization
of said unique fragments or oligonucleotide probes to said first cDNA library or sample of
genomic DNA; and

(b) isolating a DNA molecule which is complementary to said unique
fragments or to said oligonucleotide probes.

11. The method of claim 10, wherein said isolation step is accomplished by a method
selected from the group of methods consisting of gel electrophoresis, density gradient
centrifugation, sizing chromatography, affinity chromatography, immunoadsorption, and
immunoaffinity chromatography.

12. The method of claim 10, further comprising sequencing said isolated DNA
molecule.

13. The method of claim 10, further comprising amplifying said isolated DNA
molecule.

14. The method of claim 10, further comprising inserting said isolated DNA molecule
into a vector.

15. The method of claim 14, wherein said vector is an expression vector.

16. The method of claim 4, wherein said amplification of said restriction fragments is
accomplished by a method comprising:

(a) ligating one or more adapter oligonucleotides to said unique restriction
fragments to form a DNA-adapter complex;

(b) hybridizing said DNA-adapter complex, under stringent conditions, with
one or more oligonucleotide primers which are complementary to said adapter portion of said
DNA-adapter complex to form a hybridization complex; and

(c) amplifying said DNA-adapter complex.

17. The method of claim 16, wherein said adapter oligonucleotide contains one or
more restriction sites.

18. The method of claim 17, wherein said restriction sites in said adapter are used to
insert the DNA-adapter complex into a vector.

19. The method of claim 1 or claim 2, wherein said first cDNA library or sample of
genomic DNA and said second cDNA library or sample of genomic DNA are derived from a
source selected from the group consisting of an individual cell, a tissue, an organ, and a whole
organism.

20. The method of claim 19, wherein said source is a prokaryotic cell. 

21. The method of claim 19, wherein said source is a eukaryotic cell.

22. The method of claim 19, wherein said source is an animal tissue.

23. The method of claim 19, wherein said source is a human tissue.

24. The method of claim 19, wherein said source is a human embryo.

25. The method of claim 19, wherein said source is a human fetus.

26. The method of claim 19, wherein said source is a plant tissue. 

27. A method for identifying a genetic marker in a first cDNA library or sample of
genomic DNA, said method comprising the steps of:

(a) digesting first and second cDNA libraries or samples of genomic DNA with
at least one restriction enzyme to give a collection of restriction fragments; and

(b) identifying one or more unique DNA fragments from said first cDNA
library or sample of genomic DNA by comparing the fragments from said first library or sample
to fragments from said second library or sample.

28. A method for identifying a genetic marker in a second cDNA library or sample of
genomic DNA, said method comprising the steps of:

(a) digesting first and second cDNA libraries or samples of genomic DNA with
at least one restriction enzyme to give a collection of restriction fragments; and

(b) identifying one or more unique DNA fragments from said second cDNA
library or sample of genomic DNA by comparing the fragments from said second library or sample
to fragments from said first library or sample.

29. The method of claim 27 or claim 28, further comprising sequencing said unique
DNA fragment.

30. The method of claim 27 or claim 28, wherein said restriction fragments are
amplified prior to said identifying step (b).

31. The method of claim 27 or claim 28, wherein said genetic marker is selected from
the group consisting of a cancer marker, an infectious disease marker, a genetic disease marker,
a marker of embryonic development, a tissue-specific marker and an enzyme marker.

32. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from a sample of an animal suffering from cancer and said
second cDNA library or sample of genomic DNA is derived from an animal not suffering from
cancer.

33. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from a cancerous animal tissue and said second cDNA
library or sample of genomic DNA is derived from a noncancerous animal tissue.

34. The method of claim 33, wherein said first library or sample and said second library
or sample are derived from the same animal.

35. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from an animal suffering from a genetic disease and said
second cDNA library or sample of genomic DNA is derived from an animal not suffering from
said genetic disease.

36. The method of claim 35, wherein said genetic disease is schizophrenia. 

37. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from a diseased plant and said second cDNA library or
sample of genomic DNA is derived from a non-diseased plant.

38. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from a plant resistant to an environmental stress and said
second cDNA library or sample of genomic DNA is derived from a plant not resistant to said
environmental stress.

39. The method of claim 38, wherein said environmental stress is selected from the
group consisting of drought, excess temperature, diminished temperature, chemical toxicity by
herbicides, pollution, excess light and diminished light.

40. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from a pathogenic microorganism and said second cDNA
library or sample of genomic DNA is derived from a nonpathogenic microorganism.

41. The method of any one of claims 1, 2, 27 or 28, wherein said first cDNA library
or sample of genomic DNA is derived from an organism producing an enzyme and said second
cDNA library or sample of genomic DNA is derived from an organism not producing said
enzyme.

42. The method of claim 41, wherein said enzyme is a restriction enzyme, an enzyme
degrading a petroleum product, a biodegradative enzyme, a nucleic acid polymerase enzyme, a
nucleic acid ligase enzyme, an amino acid synthetase enzyme or an enzyme involved in
carbohydrate fermentation.



43. A method of defining the relationship between a first individual and a second
individual, said method comprising the steps of:

(a) digesting cDNA libraries or samples of genomic DNA obtained from said
first and second individuals with at least one restriction enzyme to give a collection of restriction
fragments;

(b) separating said restriction fragments from said first and said second
individual according to size; and

(c) determining the similarities and dissimilarities of the sizes or concentrations
of the restriction fragments separated in step (b).